Evolving Consensus Sequences with a Genetic Algorithm

Authors

  • Conrad Shyu
  • James A. Foster
Abstract

In this paper we present an approach that employs a genetic algorithm (GA) to evolve consensus sequences for DNA and protein sequence alignments. We have developed an encoding scheme such that the number of generations needed to find the optimal solution remains approximately the same regardless of the number of sequences. The complexity instead depends only on the length of the consensus sequence and the similarity among the sequences. The objective function computes sum-of-pairs (SP) scores, which serve as the fitness values. We have devised a residue profiling technique that further simplifies the calculation of the SP scores. Furthermore, to facilitate quantitative study of our GA approach, we have developed a simulation program that incorporates the most commonly used evolutionary models and generates biologically sound sequences. We performed several experiments and compared the results with those of the most commonly used heuristic alignment program, Clustal W [18, 19]. We conclude with a detailed analysis and demonstrate that our GA approach offers an attractive and competitive alternative to the heuristic approach.
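
The abstract does not spell out how the residue profile simplifies the SP score, so the following is only a minimal sketch of one common way such a simplification can work, not the authors' implementation. It assumes hypothetical scoring values (`MATCH`, `MISMATCH`, `GAP`) and hypothetical function names; the idea is that a column's SP score can be obtained from per-residue counts instead of enumerating every pair of sequences.

```python
from collections import Counter
from itertools import combinations

# Hypothetical scoring values; the source does not state its scoring scheme.
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    """Score one pair of aligned symbols ('-' denotes a gap)."""
    if a == "-" and b == "-":
        return 0  # assumption: a gap aligned with a gap scores nothing
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sp_column_naive(column):
    """Sum-of-pairs score of one column by enumerating all pairs."""
    return sum(pair_score(a, b) for a, b in combinations(column, 2))


def sp_column_profile(column):
    """Same score computed from residue counts (a simple column profile)."""
    counts = Counter(column)
    gaps = counts.pop("-", 0)
    residues = sum(counts.values())
    # Pairs of identical residues: c choose 2 for each residue type.
    match_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    # Every other residue-residue pair is a mismatch.
    mismatch_pairs = residues * (residues - 1) // 2 - match_pairs
    # Each gap is paired once with each residue.
    gap_pairs = gaps * residues
    return MATCH * match_pairs + MISMATCH * mismatch_pairs + GAP * gap_pairs


def sp_score(alignment, column_score=sp_column_profile):
    """SP score of an alignment given as equal-length strings."""
    return sum(column_score(col) for col in zip(*alignment))


if __name__ == "__main__":
    alignment = ["ACG-", "ACGT", "A-GT"]
    print(sp_score(alignment), sp_score(alignment, sp_column_naive))
```

Under these assumptions the naive version touches every pair of sequences in each column, while the count-based version needs only one pass over the column plus work proportional to the alphabet size, which is why a profile of this kind scales well as sequences are added.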

Similar resources

Evolving Consensus Sequence for Multiple Sequence Alignment with a Genetic Algorithm

In this paper we present an approach that evolves the consensus sequence [25] for multiple sequence alignment (MSA) with a genetic algorithm (GA). We have developed an encoding scheme such that the number of generations needed to find the optimal solution is approximately the same regardless of the number of sequences. Instead, it depends only on the length of the template and similarity between sequ...

Automated Discovery of Protein Motifs With Genetic Programming

Automated methods of machine learning may prove to be useful in discovering biologically meaningful information hidden in the rapidly growing databases of DNA sequences and protein sequences. Genetic programming is an extension of the genetic algorithm in which a population of computer programs is bred, over a series of generations, in order to solve a problem. Genetic programming is capable of...

Mitochondrial DNA variation in wild and hatchery populations of northern pike, Esox lucius L.

Esox lucius is an economically important freshwater species. Mitochondrial cytb, 12S rRNA, and 16S rRNA gene sequences were used to clarify the genetic variation and population structure in three E. lucius populations, i.e., one wild population (W) and two hatchery populations (Hatchery Population I, HPI, and Hatchery Population II, HPII). A total of 55 individuals, with 19 from wild and 1...

An Effective Hybrid Genetic Algorithm for Hybrid Flow Shops with Sequence Dependent Setup Times and Processor Blocking

Hybrid flow-shop, or flexible flow-shop, problems have remained a subject of intensive research for several years. Hybrid flow-shop problems overcome one of the limitations of the classical flow-shop model by allowing parallel processors at each stage of task processing. Many papers assume that there is unlimited storage available between stages and the setup times a...

Publication date: 2003